Approximate Frequent Pattern Mining
Authors
Abstract
Frequent pattern mining has been a central theme in data mining research and is an important first step in analyzing data from a broad range of applications. The traditional exact model for frequent patterns requires that every item of a pattern occur in each supporting transaction. However, real application data is usually subject to random noise or measurement error, which poses new challenges for the efficient discovery of frequent patterns from noisy data. Mining approximate frequent patterns in the presence of noise involves two key issues: the definition of a noise-tolerant mining model and the design of an efficient mining algorithm. In this paper, we give an overview of approximate itemset and sequential pattern mining.
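The contrast between the exact model and a noise-tolerant one can be illustrated with a minimal sketch. The relaxation used here (a transaction supports an itemset if it misses at most a fixed fraction of its items) is an assumption chosen for illustration, not the specific model surveyed in the paper; the function names and the `max_missing_fraction` parameter are hypothetical.

```python
from typing import FrozenSet, List


def exact_support(itemset: FrozenSet[str], transactions: List[FrozenSet[str]]) -> int:
    """Count transactions that contain every item of the itemset (exact model)."""
    return sum(1 for t in transactions if itemset <= t)


def approximate_support(
    itemset: FrozenSet[str],
    transactions: List[FrozenSet[str]],
    max_missing_fraction: float,
) -> int:
    """Count transactions that miss at most a given fraction of the itemset.

    Illustrative relaxation only: the allowed number of missing items is
    floor(max_missing_fraction * len(itemset)).
    """
    allowed_missing = int(max_missing_fraction * len(itemset))
    return sum(1 for t in transactions if len(itemset - t) <= allowed_missing)


# Toy database: noise has dropped one item from three of the transactions.
transactions = [
    frozenset("abcd"),
    frozenset("abc"),
    frozenset("abd"),
    frozenset("acd"),
    frozenset("e"),
]
pattern = frozenset("abcd")

print(exact_support(pattern, transactions))              # 1
print(approximate_support(pattern, transactions, 0.25))  # 4
```

Under the exact model the pattern is supported by a single transaction, whereas tolerating one missing item out of four recovers the three noisy occurrences as well. Setting `max_missing_fraction` to 0 reduces the relaxed count to the exact one.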
Related resources
REAFUM: Representative Approximate Frequent Subgraph Mining
Noisy graph data and pattern variations are two thorny problems faced by mining frequent subgraphs. Traditional exact-matching based methods, however, only generate patterns that have enough perfect matches in the graph database. As a result, a pattern may either remain undetected or be reported as multiple (almost identical) patterns if it manifests slightly different instances in different gr...
MACFP: Maximal Approximate Consecutive Frequent Pattern Mining under Edit Distance
Consecutive pattern mining, which aims at finding sequential patterns that are substrings, is a special case of frequent pattern mining and has played a crucial role in many real-world applications, especially in biological sequence analysis, time series analysis, and network log mining. Approximations, including insertions, deletions, and substitutions, between strings are widely used in biological seq...
Mining Top-k Approximate Frequent Patterns
Frequent pattern (itemset) mining in transactional databases is one of the most well-studied problems in data mining. One obstacle that limits the practical usage of frequent pattern mining is the extremely large number of patterns generated. Such a large size of the output collection makes it difficult for users to understand and use in practice. Even restricting the output to the border of th...
Mining approximate patterns with frequent locally optimal occurrences
We propose a novel frequent approximate pattern mining method that suits estimation of occurrence regions. Given a string s, our method enumerates its substrings that locally optimally match many substrings of s. We show an algorithm for this problem in which candidate patterns are generated without duplication using the suffix tree of s. This problem can be extended to the problem of enumerating appr...
Markov Models in the Analysis of Frequent Patterns in Financial Data
Frequent sequence mining is one of the main challenges in data mining, especially in large databases that consist of millions of records. There are a number of different applications where frequent sequence mining is very important: medicine, finance, internet behavioural data, marketing data, etc. Exact frequent sequence mining methods make multiple passes over the database and if the data...